NSF PAR Search | NSF Public Access Repository

Autocovariance function estimation via difference schemes for a semiparametric change point model with m‐dependent errors

https://doi.org/10.1111/anzs.70002

Levine, Michael; Tecuapetla‐Gómez, Inder (April 2025, Australian & New Zealand Journal of Statistics)

We discuss a broad class of difference‐based estimators of the autocovariance function in a semiparametric regression model where the signal consists of the sum of a smooth function and another stepwise function whose number of jumps and locations are unknown (change points) while the errors are stationary and m‐dependent. We establish that the influence of the smooth part of the signal over the bias of our estimators is negligible; this is a general result as it does not depend on the distribution of the errors. We show that the influence of the unknown smooth function is negligible also in the mean squared error (MSE) of our estimators. Although we assumed Gaussian errors to derive the latter result, our finite sample studies suggest that the class of proposed estimators still shows small MSE when the errors are not Gaussian. Our simulation study also demonstrates that, when the error process is mis‐specified as an AR(1) instead of an m‐dependent process, our proposed method can estimate autocovariances about as well as some methods specifically designed for the AR(1) case, and sometimes even better than them. We also allow both the number of change points and the magnitude of the largest jump to grow with the sample size n. In this case, we provide conditions on the interplay between the growth rate of these two quantities as well as the vanishing rate of the modulus of continuity (of the signal's smooth part) that ensure consistency of our autocovariance estimators. As an application, we use our approach to provide a better understanding of the possible autocovariance structure of a time series of global averaged annual temperature anomalies. Finally, the R package dbacf complements this article.
Free, publicly-accessible full text available April 29, 2026
A smoothed semiparametric likelihood for estimation of nonparametric finite mixture models with a copula-based dependence structure

https://doi.org/10.1007/s00180-024-01483-4

Levine, Michael; Mazo, Gildas (June 2024, Computational Statistics)

Full Text Available
Nonparametric clustering of RNA-sequencing data

https://doi.org/10.1002/sam.11638

Lozano, Gabriel; Atallah, Nadia; Levine, Michael (December 2023, Statistical Analysis and Data Mining: The ASA Data Science Journal)

Identification of clusters of co‐expressed genes in transcriptomic data is a difficult task. Most algorithms used for this purpose can be classified into two broad categories: distance‐based or model‐based approaches. Distance‐based approaches typically utilize a distance function between pairs of data objects and group similar objects together into clusters. Model‐based approaches are based on the mixture‐modeling framework. Compared to distance‐based approaches, model‐based approaches offer better interpretability because each cluster can be explicitly characterized in terms of the proposed model. However, these models present a particular difficulty in identifying a correct multivariate distribution that a mixture can be based upon. In this manuscript, we first review some of the approaches used to select a distribution for the needed mixture model. Then, we propose avoiding this problem altogether by using a nonparametric MSL (maximum smoothed likelihood) algorithm. This algorithm was proposed earlier in the statistical literature but has not been, to the best of our knowledge, applied to transcriptomics data. The salient feature of this approach is that it avoids explicit specification of distributions of individual biological samples altogether, thus making the task of a practitioner easier. We performed both a simulation study and an application of the proposed algorithm to two different real datasets. When used on a real dataset, the algorithm produces a large number of biologically meaningful clusters and performs at least as well as several other mixture‐based algorithms commonly used for RNA‐seq data clustering. Our results also show that this algorithm is capable of uncovering clustering solutions that may go unnoticed by several other model‐based clustering algorithms. Our code is publicly available on GitHub at https://github.com/Matematikoi/non_parametric_clustering
Full Text Available
The allosteric mechanism leading to an open-groove lipid conductive state of the TMEM16F scramblase

https://doi.org/10.1038/s42003-022-03930-8

Khelashvili, George; Kots, Ekaterina; Cheng, Xiaolu; Levine, Michael V.; Weinstein, Harel (December 2022, Communications Biology)

TMEM16F is a Ca²⁺-activated phospholipid scramblase in the TMEM16 family of membrane proteins. Unlike other TMEM16s exhibiting a membrane-exposed hydrophilic groove that serves as a translocation pathway for lipids, the experimentally determined structures of TMEM16F show the groove in a closed conformation even under conditions of maximal scramblase activity. It is currently unknown if or how the TMEM16F groove can open for lipid scrambling. Here we describe the analysis of ~400 µs all-atom molecular dynamics (MD) simulations of TMEM16F, revealing an allosteric mechanism leading to an open-groove, lipid scrambling competent state of the protein. The groove opens into a continuous hydrophilic conduit that is highly similar in structure to that seen in other activated scramblases. The allosteric pathway connects this opening to an observed destabilization of the Ca²⁺ ion bound at the distal site near the dimer interface, and to the dynamics of specific protein regions that produce the open-groove state to scramble phospholipids.
Full Text Available
Minimax optimal estimation in partially linear additive models under high dimension

https://doi.org/10.3150/18-BEJ1021

Yu, Zhuqing; Levine, Michael; Cheng, Guang (May 2019, Bernoulli)

Full Text Available
Gain Modulation by Corticostriatal and Thalamostriatal Input Signals during Reward-Conditioned Behavior

https://doi.org/10.1016/j.celrep.2019.10.060

Lee, Kwang; Bakhurin, Konstantin I.; Claar, Leslie D.; Holley, Sandra M.; Chong, Natalie C.; Cepeda, Carlos; Levine, Michael S.; Masmanidis, Sotiris C. (November 2019, Cell Reports)

Full Text Available
An MM algorithm for estimation of a two component semiparametric density mixture with a known component

https://doi.org/10.1214/18-EJS1417

Shen, Zhou; Levine, Michael; Shang, Zuofeng (January 2018, Electronic Journal of Statistics)

Full Text Available
Hunchback is counter-repressed to regulate even-skipped stripe 2 expression in Drosophila embryos

https://doi.org/10.1371/journal.pgen.1007644

Vincent, Ben J.; Staller, Max V.; Lopez-Rivera, Francheska; Bragdon, Meghan D.; Pym, Edward C.; Biette, Kelly M.; Wunderlich, Zeba; Harden, Timothy T.; Estrada, Javier; DePace, Angela H.; et al. (September 2018, PLOS Genetics)

Full Text Available
